Check that all the directories for the .nc files got made:

``` r
library(here)
library(dplyr)
library(stringr)

source_dl <- dir(here("data_raw", "CMIP6"))
source_id <- idx$source_id %>% unique() %>% str_to_lower() %>% str_replace_all("-", "_")
stopifnot(all(source_id %in% source_dl))
```
Check that all the corresponding .csv files exist:

``` r
csvs <- list.files(here("data"))
stopifnot(all(paste0(source_id, "_data.csv") %in% csvs))
```
For this analysis, I only want to use models that provide the pr, tas, hfss, and hfls variables in all 5 scenarios (historical, ssp126, ssp245, ssp370, and ssp585).
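That filter can be sketched as follows, assuming the download index `idx` has one row per file with `source_id`, `experiment_id`, and `variable_id` columns (the column names are assumptions based on ESGF conventions, not confirmed by this document):

``` r
library(dplyr)

required_vars <- c("pr", "tas", "hfss", "hfls")
required_exps <- c("historical", "ssp126", "ssp245", "ssp370", "ssp585")

# models that supply every required variable in every required experiment
complete_models <- function(idx) {
  idx %>%
    filter(variable_id %in% required_vars, experiment_id %in% required_exps) %>%
    distinct(source_id, experiment_id, variable_id) %>%
    count(source_id) %>%
    # a complete model has 4 variables x 5 experiments = 20 combinations
    filter(n == length(required_vars) * length(required_exps)) %>%
    pull(source_id)
}
```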
Perform necessary calculations to compare PET and SPEI among models and between models and observed.
For PET, I’m using the “energy-only” method proposed by Milly and Dunne (2016), eq. 8:
\[ PET = 0.8(R_n - G) \]
except that, following their notes, I estimate \(R_n - G\) as hfls +
hfss after converting to units of mm/day using the latent
heat of vaporization of water, given by their eq. 2:
\[ L_v(T) = 2.501 - 0.002361T \] in MJ/kg (with \(T\) in °C).
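The unit conversion can be sketched like this (a minimal sketch; the function name and vector inputs are assumptions, with hfls and hfss in W/m² and tas in °C). Since 1 W/m² = 0.0864 MJ m⁻² day⁻¹, dividing by \(L_v\) in MJ/kg yields kg m⁻² day⁻¹, i.e. mm/day:

``` r
# Energy-only PET (Milly & Dunne 2016, eq. 8): hfls + hfss approximates
# Rn - G; inputs are monthly means in W/m², tas in °C.
pet_energy_only <- function(hfls, hfss, tas) {
  lv   <- 2.501 - 0.002361 * tas   # latent heat of vaporization, MJ/kg (their eq. 2)
  flux <- (hfls + hfss) * 0.0864   # W/m² -> MJ m^-2 day^-1
  0.8 * flux / lv                  # (MJ m^-2 day^-1) / (MJ/kg) = kg m^-2 day^-1 = mm/day
}
```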
For the observed data and the CMIP6 data from the same period, I calculate 3-month SPEI using precipitation and PET.
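As a sketch, the 3-month SPEI can be computed from the monthly climatic water balance (precipitation minus PET) with the SPEI package's `spei()` function and its `scale` argument; the input series below is made up for illustration:

``` r
library(SPEI)

# monthly climatic water balance: precipitation minus PET, in mm
# (a made-up 35-year monthly series, 1980 onward)
set.seed(1)
cwb <- ts(rnorm(12 * 35, mean = 20, sd = 40), start = c(1980, 1), frequency = 12)

spei3 <- spei(cwb, scale = 3)  # 3-month SPEI; standardized values in spei3$fitted
```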
Because SPEI, the variable of interest, is standardized, we looked
for models that captured the seasonality of precipitation rather than
focusing on how well they estimated exact precipitation amounts.
We calculated mean precipitation for January through December and then
computed a correlation coefficient between the 12 monthly means from
each CMIP6 model and the 12 observed means. We eliminated models with
correlation coefficients less than 0.6 for precipitation. SPEI also
takes evapotranspiration into account, specifically through
hfls and hfss. Because these variables were not available for the
observed data, we screened on temperature as well, but with a less
stringent cutoff of correlations greater than 0.4. Additionally, we
calculated SPEI for each model and counted the number of droughts (SPEI
< −1) in each month. However, SPEI was not used to determine which
models remained in our ensemble, because we had no strong
expectation that accuracy of past SPEI or drought frequency is a good
measure of GCM skill in predicting future SPEI.
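The seasonality screen can be sketched as below, assuming a long table `monthly` with one row per model and month (columns `source_id`, `month`, and monthly-mean `pr` and `tas`) and an `obs` table with the 12 observed monthly means sorted by month; all of these names are assumptions:

``` r
library(dplyr)

# Pearson correlation of 12 monthly means against observed, per model;
# keep models above the cutoffs (r >= 0.6 for pr, r > 0.4 for tas)
seasonality_screen <- function(monthly, obs) {
  monthly %>%
    arrange(source_id, month) %>%
    group_by(source_id) %>%
    summarize(
      pr_cor  = cor(pr,  obs$pr),
      tas_cor = cor(tas, obs$tas)
    ) %>%
    filter(pr_cor >= 0.6, tas_cor > 0.4)
}
```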
**Comparison of observed data to CMIP6 'historical' output.** Data only from 1980 to 2015 to match observed. (The original table also included sparkline plots of monthly precipitation, temperature, and drought seasonality, not reproduced here.)

| Source | pr cor¹ | tas cor¹ | Droughts (SPEI < −1): mean ± SD duration (months) |
|---|---|---|---|
| observed² | 1.00 | 1.00 | 3.4 ± 2.8 |
| cas_esm2_0 | 0.93 | 0.93 | 2.1 ± 1.6 |
| fgoals_f3_l | 0.94 | 0.86 | 2 ± 1.4 |
| awi_cm_1_1_mr | 0.80 | 0.80 | 2.4 ± 1.3 |
| fgoals_g3 | 0.80 | 0.73 | 2.7 ± 2.1 |
| taiesm1 | 0.77 | 0.79 | 3 ± 2.5 |
| cmcc_esm2 | 0.72 | 0.77 | 2.2 ± 1.5 |
| access_esm1_5 | 0.68 | 0.67 | 2.1 ± 1.9 |
| cmcc_cm2_sr5 | 0.60 | 0.71 | 2.5 ± 2.2 |
| canesm5 | 0.50 | 0.37 | 3.1 ± 2.5 |
| bcc_csm2_mr | 0.14 | 0.63 | 2.2 ± 1.3 |
| ec_earth3_veg_lr | 0.25 | 0.32 | 2.6 ± 2 |
| iitm_esm | 0.25 | −0.04 | 2.2 ± 1.3 |
| access_cm2 | 0.05 | 0.12 | 2 ± 1.2 |
| cams_csm1_0 | 0.20 | −0.09 | 2 ± 1.2 |

¹ Red numbers highlight correlations (Pearson's r) < 0.6 for precipitation and < 0.4 for mean temperature.
² Observed data from Xavier et al. (2016)
Check for overlap in the date ranges of the historical experiment and the SSPs:
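A plain-dplyr sketch of that check, assuming `bigdf` has `source_id`, `experiment_id`, and `date` columns (as in the wrangling code later in this section):

``` r
library(dplyr)

# per model: last historical date, first SSP date, and whether they overlap
check_overlap <- function(bigdf) {
  bigdf %>%
    group_by(source_id) %>%
    summarize(
      hist_end  = max(date[experiment_id == "historical"]),
      ssp_start = min(date[experiment_id != "historical"]),
      overlap   = ssp_start <= hist_end
    )
}
```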
[Pointblank validation of `df_dates` (2022-03-02 15:40:44, 12.2 s): 14 `col_vals_gt()` steps, each interrogating 4 units. Steps 1–11, 13, and 14 passed all 4 units; step 12 failed all 4 units, triggering a STOP condition.]
For fgoals_g3, the historical experiment ends in December
2016, a year after the other models' historical experiments. The SSPs start
in January 2016.

fgoals_g3 also has duplicated dates within the
historical experiment:
``` r
# pull the fgoals_g3 historical rows and find dates that appear more than once
fgoals_hist <-
  bigdf %>%
  filter(source_id == "fgoals_g3", experiment_id == "historical")

dupes <-
  fgoals_hist %>%
  filter(duplicated(date)) %>%
  pull(date)

fgoals_hist %>%
  filter(date %in% dupes) %>%
  arrange(date)
```
## # A tibble: 84 × 13
## source_id experiment_id time hfls hfss pr tas tasmax
## <chr> <chr> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 fgoals_g3 historical 2015-02-15 05:00:00 59.4 92.4 34.7 31.0 38.7
## 2 fgoals_g3 historical 2015-02-15 05:00:00 59.4 92.4 34.7 31.0 38.7
## 3 fgoals_g3 historical 2015-02-15 05:00:00 59.4 92.4 34.7 31.0 38.7
## 4 fgoals_g3 historical 2015-02-15 05:00:00 59.4 92.4 34.7 31.0 38.7
## 5 fgoals_g3 historical 2015-03-16 16:00:00 69.3 64.8 111. 28.6 35.5
## 6 fgoals_g3 historical 2015-03-16 16:00:00 69.3 64.8 111. 28.6 35.5
## 7 fgoals_g3 historical 2015-03-16 16:00:00 69.3 64.8 111. 28.6 35.5
## 8 fgoals_g3 historical 2015-03-16 16:00:00 69.3 64.8 111. 28.6 35.5
## 9 fgoals_g3 historical 2015-04-16 04:00:00 70.2 59.9 145. 28.6 35.6
## 10 fgoals_g3 historical 2015-04-16 04:00:00 70.2 59.9 145. 28.6 35.6
## # … with 74 more rows, and 5 more variables: tasmin <dbl>, pet <dbl>, cb <dbl>,
## # spei <dbl>, date <date>
Something went wrong in the wrangling as a result of the overlap
between the historical experiment and the SSPs. Be sure to filter fgoals_g3 to
remove the last year from the historical experiment.
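A minimal sketch of that fix, again assuming `bigdf` has `source_id`, `experiment_id`, and `date` columns, and that the overlapping historical year begins in January 2016:

``` r
library(dplyr)

drop_fgoals_overlap <- function(bigdf) {
  bigdf %>%
    # drop the overlapping final year of fgoals_g3's historical experiment
    filter(!(source_id == "fgoals_g3" &
             experiment_id == "historical" &
             as.integer(format(date, "%Y")) >= 2016)) %>%
    # and drop any remaining exact duplicates within a model/experiment
    distinct(source_id, experiment_id, date, .keep_all = TRUE)
}
```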
[Pointblank validation of `bigdf` (2022-03-02 15:40:58, 5.4 s): 6 steps. Steps 1–3 checked for reasonable temperature values (28K units each; step 2 failed on 9 units). Step 4 checked for reasonable precip values (28K units; 147 failed). Step 5, `col_vals_not_null()`, passed all 88K units. Step 6, `col_vals_not_in_set()`, failed on 148 of 88K units. No step reached the STOP threshold.]
All of the failing temperature tests are for
access_esm1_5, which has 9 tasmax values above
45 °C (max 48 °C). Most of the failing precipitation rows are also for
access_esm1_5, which predicts ~100 months with
precipitation > 400 mm (max 594 mm) in the historical experiment.
Infinite values for SPEI are essentially just beyond the range of
quantification. bcc_csm2_mr and fgoals_f3_l
have the largest numbers of −Inf values for SPEI.
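Those counts can be tallied with a short sketch like this (assuming `bigdf` has `source_id` and `spei` columns):

``` r
library(dplyr)

# number of infinite SPEI values (Inf or -Inf) per model, most first
count_infinite_spei <- function(bigdf) {
  bigdf %>%
    filter(is.infinite(spei)) %>%
    count(source_id, sort = TRUE)
}
```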
Below are plots of all data downloaded from each CMIP6 source.